Abstract: A distributed algorithm is self-stabilizing if after faults and attacks hit the
system and place it in some arbitrary global state, the system recovers from this
catastrophic situation without external intervention in finite time. In this paper, we consider
the problem of constructing self-stabilizingly a maximal independent set in uniform
unidirectional networks of arbitrary shape. On the negative side, we present evidence that in
uniform networks, deterministic self-stabilization of this problem is impossible. Also, the
silence property (i.e. having communication fixed from some point in every execution) is
impossible to guarantee, either for deterministic or for probabilistic variants of protocols.

On the positive side, we present a deterministic protocol for arbitrary unidirectional
networks with unique identifiers that exhibits polynomial space and time
complexity in asynchronous scheduling. We complement the study with probabilistic
protocols for the uniform case: the first probabilistic protocol requires infinite memory but copes
with asynchronous scheduling, while the second probabilistic protocol has polynomial space
complexity but can only handle synchronous scheduling. Both probabilistic solutions have
expected polynomial time complexity.
Self-stabilization of a Maximal Independent Set in Unidirectional Networks is Hard

Summary: A distributed algorithm is self-stabilizing if, after faults and attacks have hit
the system and placed it in an arbitrary global state, the system recovers correct
behavior in finite time without external intervention. In this article, we consider the
problem of the self-stabilizing construction of a maximal independent set in arbitrary
uniform unidirectional networks. We present a negative result indicating that in uniform
networks, deterministic self-stabilization of this problem is impossible. Moreover, the
silence property (i.e. guaranteeing that from some point of every execution, the
communications between the nodes of the network are fixed) is impossible to guarantee,
both for deterministic and for probabilistic variants of protocols.

Our positive results are manifold. We present a deterministic protocol for arbitrary
identified unidirectional networks whose time and space complexity remains polynomial
under asynchronous scheduling. We complement the study with probabilistic protocols for
the uniform case: the first protocol requires infinite memory but supports asynchronous
scheduling, while the second protocol uses polynomial memory but requires synchronous
scheduling. Both protocols have polynomial expected time complexity.

Keywords: Distributed systems, Distributed algorithm, Maximal independent set,
Unidirectional networks, Self-stabilization, Probabilistic self-stabilization



Stabilizing Maximal Independent Set in Unidirectional Networks is Hard 



1 Introduction 

One of the most versatile techniques to ensure forward recovery of distributed systems is
that of self-stabilization [10, 11]. A distributed algorithm is self-stabilizing if after faults
and attacks hit the system and place it in some arbitrary global state, the system recovers
from this catastrophic situation without external (e.g. human) intervention in finite time.

The vast majority of self-stabilizing solutions in the literature [11] consider bidirectional
communication capabilities, i.e. if a process u is able to send information to another process
v, then v is always able to send information back to u. This assumption is valid in many
cases, but cannot capture the fact that asymmetric situations may occur, e.g. in wireless
networks, it is possible that u is able to send information to v yet v cannot send any
information back to u (u may have a wider range antenna than v). Asymmetric situations,
which we refer to in the following as unidirectional networks, preclude many
common techniques in self-stabilization from being used, such as preserving local predicates
(a process u may take an action that violates a predicate involving its outgoing neighbors
without u knowing it, since u cannot get any input from its outgoing neighbors).

Related works Self-stabilizing solutions are considered easier to implement in
bidirectional networks since detecting incorrect situations requires less memory and computing
power [3], recovering can be done locally [2], and Byzantine containment can be
guaranteed [17, 18, 20].

Investigating the possibility of self-stabilization in unidirectional networks was recently
emphasized in several papers [1, 6, 7, 8, 13, 14, 9, 15, 5]. However, topology or knowledge
about the system varies: [7] considers acyclic unidirectional networks, where erroneous initial
information may not loop; [1, 6, 9, 13] assume unique identifiers and strong connectivity so
that global communication can be implemented; [8, 14, 15] make use of distinguished
processes yet operate on arbitrary unidirectional networks.

Tackling arbitrary uniform unidirectional networks in the context of self-stabilization
proved to be hard. In particular, [5, 4] studied the self-stabilizing vertex coloring problem
in unidirectional uniform networks (where adjacent nodes must ultimately output different
colors). Deterministic and probabilistic solutions to the vertex coloring problem [16, 19] in
bidirectional networks have local complexity (Δ states per process are required, and O(Δ)
(resp. O(1)) actions per process are needed to recover from an arbitrary state in the case of a
deterministic (resp. probabilistic) algorithm). By contrast, in unidirectional networks, [5]
proves a lower bound of n states per process (where n is the network size) and a recovery
time of at least n(n − 1)/2 actions in total (and thus Ω(n) actions per process) in the case
of deterministic uniform algorithms, while [4] provides a probabilistic solution that remains
either local in space or local in time, but not both.

Our contribution In this paper, we consider the problem of constructing self-stabilizingly
a maximal independent set in uniform unidirectional networks of arbitrary shape. It turns
out that local maximization (i.e. maximal independent set) is strictly more difficult than
local predicate maintenance (i.e. vertex coloring). On the negative side, we present evidence
that in uniform networks, deterministic self-stabilization of this problem is impossible. Also,
the silence property (i.e. having communication fixed from some point in every execution)
is impossible to guarantee, either for deterministic or for probabilistic variants of protocols.

On the positive side, we present a deterministic protocol for arbitrary unidirectional
networks with unique identifiers that exhibits O(m log n) space complexity
and O(D) time complexity in asynchronous scheduling, where n is the network size and
D is the network diameter. We complement the study with probabilistic protocols for
the uniform case: the first probabilistic protocol requires infinite memory but copes with
asynchronous scheduling (stabilizing in time O(log n + log ℓ + D), where ℓ denotes the number
of fake identifiers in the initial configuration), while the second probabilistic protocol has
polynomial space complexity (in O(m log n)) but can only handle synchronous scheduling
(stabilizing in time O((n + ℓ) log n)).

Outline The remainder of the paper is organized as follows: Section 2 presents the
programming model and problem specification. Section 3 presents our negative results, while
Section 4 details the protocols. Section 5 gives some concluding remarks and open questions.

2 Preliminaries 

Program model A program consists of a set V of n processes. A process maintains a
set of variables that it can read or update, that define its state. A process contains a set
of constants that it can read but not update. A binary relation E is defined over distinct
processes such that (i, j) ∈ E if and only if j can read the variables maintained by i; i is
a predecessor of j, and j is a successor of i. The set of predecessors (resp. successors) of i
is denoted by P.i (resp. S.i), and the union of predecessors and successors of i is denoted
by N.i, the neighbors of i. The ancestors of process i are recursively defined as follows:
predecessors of i are ancestors of i, and ancestors of each predecessor of i are also ancestors
of i. The descendants of i are similarly defined using successors (instead of predecessors).

For processes i and j in V, d(i, j) denotes the distance (or the length of the shortest
path) from i to j in the directed graph (V, E). We define, for convenience, the distance
as d(i, i) = 0 and d(i, j) = ∞ if j is not reachable from i. The diameter D is defined as
D = max{d(i, j) | (i, j) ∈ V × V, d(i, j) ≠ ∞}.
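
As an illustration of these definitions, the following minimal sketch computes d(i, j) by breadth-first search and takes D as the largest finite distance. The graph encoding (a dictionary mapping each process to its set of successors) and the function names are assumptions made for this example only.

```python
from collections import deque
from math import inf

def distances_from(source, successors):
    """Return d(source, j) for every j, with inf when j is not reachable."""
    dist = {node: inf for node in successors}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in successors[u]:
            if dist[v] == inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(successors):
    """Return D, the maximum of d(i, j) over all pairs at finite distance."""
    best = 0
    for i in successors:
        for d in distances_from(i, successors).values():
            if d != inf and d > best:
                best = d
    return best

# Hypothetical example: directed 3-cycle a -> b -> c -> a with a tail c -> t.
example = {"a": {"b"}, "b": {"c"}, "c": {"a", "t"}, "t": set()}
```

In this example no process is reachable from t, so the pairs starting at t are simply ignored when taking the maximum.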

An action has the form ⟨name⟩ : ⟨guard⟩ → ⟨command⟩. A guard is a Boolean
predicate over the variables of the process and its predecessors. A command is a sequence
of statements assigning new values to the variables of the process. We refer to a variable
v and an action a of process i as v.i and a.i respectively. A parameter is used to define a
set of actions as one parameterized action. Notice that actions of a process are completely
independent of its successors.

A configuration of the program is the assignment of a value to every variable of each
process from its corresponding domain. Each process contains a set of actions. In some
configuration, an action is enabled if its guard is true in the configuration, and a process
is enabled if it has at least one enabled action in the configuration. A computation is a
maximal sequence of configurations γ_0, γ_1, ... such that for each configuration γ_i, the next
configuration γ_{i+1} is obtained by executing the command of at least one action that is
enabled in γ_i. Maximality of a computation means that the computation is infinite or it
terminates in a configuration where none of the actions are enabled. A program that only
has terminating computations is silent.

A scheduler is a predicate on computations, that is, a scheduler is a set of possible 
computations, such that every computation in this set satisfies the scheduler predicate. We 
consider only weakly fair schedulers, where no process can remain enabled in a computation 
without executing any action. We distinguish three particular schedulers in the sequel of 
the paper: the distributed scheduler corresponds to predicate true (that is, all weakly fair 
computations are allowed). The locally central scheduler implies that in any configuration 
belonging to a computation satisfying the scheduler, no two enabled actions are executed 
simultaneously on neighboring processes. The synchronous scheduler implies that in any 
configuration belonging to a computation satisfying the scheduler, every enabled process 
executes one of its enabled actions. 

The distributed and locally central schedulers model asynchronous distributed systems.
In asynchronous distributed systems, time is usually measured by asynchronous rounds
(simply called rounds). Let E = γ_0, γ_1, ... be a computation. The first round of E is the
minimum prefix of E, E_1 = γ_0, γ_1, ..., γ_k, such that every enabled process in γ_0 executes
its action or becomes disabled in E_1. Round t (t ≥ 2) is defined recursively, by applying
the above definition of the first round to E' = γ_k, γ_{k+1}, .... Intuitively, every process has a
chance to update its state in every round.

A configuration conforms to a predicate if this predicate is true in this configuration; 
otherwise the configuration violates the predicate. By this definition every configuration 
conforms to predicate true and none conforms to false. Let R and S be predicates over the 
configurations of the program. Predicate R is closed with respect to the program actions 
if every configuration of the computation that starts in a configuration conforming to R 
also conforms to R. Predicate R converges to S if R and S are closed and any computation 
starting from a configuration conforming to R contains a configuration conforming to S. The 
program deterministically stabilizes to R if and only if true converges to R. The program 
probabilistically stabilizes to R if and only if true converges to R with probability 1. 

Problem specification Each process i defines a function mis.i that takes as input the
states of i and its predecessors, and outputs a value in {true, false}. The unidirectional
maximal independent set (denoted by UMIS in the sequel) predicate is satisfied if and only
if for every i ∈ V, either mis.i = true ∧ ∀j ∈ N.i, mis.j = false or mis.i = false ∧ ∃j ∈
N.i, mis.j = true.
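
The predicate can be checked mechanically on a given configuration. The sketch below is only an illustration: the encoding of the network as a dictionary of predecessor sets and the name `umis_holds` are assumptions of this example, not part of the model.

```python
def umis_holds(predecessors, mis):
    """Check the UMIS predicate.

    predecessors: {process: set of its predecessors}
    mis:          {process: output of mis.i, as a bool}
    """
    # N.i is the union of the predecessors and the successors of i.
    neighbors = {i: set(preds) for i, preds in predecessors.items()}
    for i, preds in predecessors.items():
        for p in preds:
            neighbors[p].add(i)
    for i in predecessors:
        some_neighbor_in = any(mis[j] for j in neighbors[i])
        if mis[i] and some_neighbor_in:
            return False  # independence is violated at i
        if not mis[i] and not some_neighbor_in:
            return False  # maximality is violated at i
    return True

# Hypothetical example: directed 3-cycle a -> b -> c -> a.
cycle = {"a": {"c"}, "b": {"a"}, "c": {"b"}}
```

Note that independence and maximality are evaluated on N.i, that is, ignoring edge directions, even though a process can only read its predecessors.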
Figure 1: Impossibility of self-stabilizing UMIS. (a) System A. (b) System B.

3 Impossibility Results in anonymous networks 

In this section, we consider anonymous and uniform networks, where processes of the same 
in-degree execute exactly the same code (note however that probabilistic protocols may 
exhibit different actual behaviors when making use of a random variable). 

Theorem 1 There exists no silent self-stabilizing solution for the UMIS problem.

Proof. Assume there exists such a solution and consider System A as depicted in
Figure 1.(a). Since the protocol is silent, it reaches a terminal configuration where exactly
one of the three processes, say a, has mis.a = true. Now consider the system in Figure 1.(b),
with the states of the processes in the tail (that is b' and c') being the same as those in the
3-cycle (that is b and c). Both processes with state S' (b and b') have the same in-degree and
the same predecessor; as the one in the cycle (b) is silent, the second one (b') is also silent.
Both processes with state S'' (c and c') have the same in-degree and the same predecessor
state; as the one in the cycle (c) is silent, the second one (c') is also silent. As a result, both
processes b' and c' in the tail of System B never move. Since the UMIS function is based
solely on the current state, in-degree, and predecessor state, the UMIS function returns the
same result for both processes b and b' in state S' and for both processes c and c' in state
S''. So, both processes b' and c' in the tail are not in the UMIS. Overall, System B describes
a terminal configuration that is not a maximal independent set (the UMIS predicate does
not hold at c'). □

Notice that the impossibility result of Theorem 1 holds even for probabilistic potential
solutions. We now prove that relaxing the silence property still prevents the existence of
deterministic solutions.

Theorem 2 There exists no deterministic self-stabilizing solution for the UMIS problem.

Proof. Assume there exists such a solution and consider the two systems A and
B that are depicted in Figure 1. We consider a computation of System A that eventually
ends up in a stable output of the mis function for all processes a, b, and c (a being the one
process with mis.a = true), and construct a sibling execution in System B as follows:

• processes b and b' (resp. c and c') in System B have the same initial states as b (resp.
c) in System A,

• anytime process b (resp. c) is executed in System A, both processes b and b' (resp. c
and c') are executed in System B,

• anytime a is executed in System A, a is also executed in System B.

Now, at any time, in System B, both processes b and b' are in the same state, with the
same predecessors' states. As a result, the output of their mis function is the same. The
same holds for processes c and c'. Since System A eventually ends up in a configuration from
which all mis functions are stable, the same holds for System B, where mis.b' and mis.c'
both return false. As a result, a UMIS is never constructed in System B. □
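
The lockstep argument can be observed on a toy example. In the sketch below, the topology of System B (a 3-cycle a, b, c with a tail b', c' where b' reads a and c' reads b') is inferred from the proof text since the figure is not reproduced here, and `rule` is an arbitrary, hypothetical deterministic transition function standing in for any uniform protocol. The synchronous run is one particular sibling execution.

```python
# Predecessor of each process in System B (assumed from the proof):
# 3-cycle a -> b -> c -> a, plus the tail a -> b' -> c'.
PRED_B = {"a": "c", "b": "a", "c": "b", "b'": "a", "c'": "b'"}

def rule(own_state, pred_state):
    """Hypothetical deterministic uniform rule: a function of the process
    state and its predecessor state only (all in-degrees equal 1)."""
    return (3 * own_state + pred_state + 1) % 7

def run_system_b(initial, rounds):
    """Execute every process once per round, all reading the previous configuration."""
    state = dict(initial)
    history = [dict(state)]
    for _ in range(rounds):
        state = {p: rule(state[p], state[PRED_B[p]]) for p in PRED_B}
        history.append(dict(state))
    return history

# b' and c' start in the same states as b and c, as in the proof.
initial = {"a": 4, "b": 2, "c": 5, "b'": 2, "c'": 5}
history = run_system_b(initial, 50)
```

Whatever deterministic rule is substituted for `rule`, b and b' (resp. c and c') remain in identical states in every configuration of this run, so any output computed from the state and the predecessor state coincides as well.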

4 Possibility Results 

The previous impossibility results yield that for the deterministic case, only non-uniform
networks admit a self-stabilizing solution for the UMIS problem. In Section 4.1, we present
such a deterministic solution.

For anonymous and uniform networks, there remains the probabilistic case. We proved
that probabilistic yet silent solutions are impossible, so both our solutions are non-silent.
The one that is presented in Section 4.2 performs in asynchronous networks but requires
unbounded memory, while the one that is presented in Section 4.3 performs in synchronous
networks and uses O(m log n) memory per process.

4.1 Deterministic solution with identifiers 

The intuition of the solution is as follows. Every process collects the predecessor information 
from all of its ancestors using the self-stabilizing approach given in [9, 12, 15]. From the 
collected information, each process i can reconstruct the exact topology of the subgraph 
consisting of all the ancestors and i itself. Then, depending on where the process is located, 
two possibilities can be considered: 

1. The process is in a strongly connected component that includes all of its ancestors. 
In the directed acyclic graph of strongly connected components, this process is in a 
source component. Then every process in the source component constructs the same 
topology. The MIS in this source component is constructed for example by giving 
processes priority in the descending order of identifiers (i.e., the process with maximal 
identifier has highest priority). 
2. The process is in a non-source strongly connected component in the same acyclic graph
of strongly connected components. Then, the same process as in the previous situation 
repeats, with the additional constraint that stronger priority is given to the processes 
in the upwards strongly connected components. 

The detailed algorithm is given in Algorithm 4.1. 

Algorithm 4.1 Deterministic UMIS algorithm in asynchronous networks with identifiers

constants of process i
    id_i: identifier of i;
    P_i: identifier set of P.i;
variables of process i
    Topology_i: set of (id, ID, dist) tuples;  // topology that i is currently aware of.
        // id: a process identifier
        // ID: identifier set of P.(id)
        // dist: distance from id to i in Topology_i.
function
    update(Topology_i)
        Topology_i := {(id_i, P_i, 0)} ∪ ⋃_{j ∈ P_i} {(id, ID, dist + 1) | (id, ID, dist) ∈ Topology_j};
        while ∃(id, ID, dist), (id', ID', dist') ∈ Topology_i s.t. id = id' and dist < dist'
            remove (id', ID', dist') from Topology_i;
        while ∃(id, ID, dist), (id', ID', dist') ∈ Topology_i s.t. id = id' and ID ≠ ID'
            remove one of them (arbitrarily) from Topology_i;
        while ∃(id, ID, dist) ∈ Topology_i s.t. id is unreachable to i in Topology_i
            remove (id, ID, dist) from Topology_i;
    UMIS(Topology_i)
        WorkingTp_i := Topology_i;
        UMIS_i := ∅;
        while ∃(id_i, P_i, 0) ∈ WorkingTp_i {
            Let W be a source strongly connected component of WorkingTp_i;
            for each id ∈ W in the descending order of identifiers
                if UMIS_i ∪ {id} is an independent set
                    UMIS_i := UMIS_i ∪ {id};
            WorkingTp_i := WorkingTp_i − W;
        }
        if id_i ∈ UMIS_i
            output true;
        else
            output false;
actions of process i
    true → update(Topology_i); UMIS(Topology_i);
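
The selection step of the UMIS function can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm itself: the topology is given as a dictionary from integer identifiers to predecessor identifier sets (distances are dropped), it is assumed to be closed under predecessors, strongly connected components are obtained by a naive mutual-reachability test, and all components are processed rather than stopping once the component of i has been handled. The names `reachable_from` and `umis` are hypothetical.

```python
def reachable_from(start, succ):
    """Set of nodes reachable from start (including start itself)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def umis(preds):
    """Select a UMIS, processing source components first and, inside a
    component, identifiers in descending order."""
    succ = {u: set() for u in preds}
    for v, ps in preds.items():
        for p in ps:
            succ[p].add(v)
    reach = {u: reachable_from(u, succ) for u in preds}
    remaining, selected = set(preds), set()
    while remaining:
        # Find a source strongly connected component of the remaining graph.
        for u in sorted(remaining):
            component = {v for v in remaining if v in reach[u] and u in reach[v]}
            if all(p in component or p not in remaining
                   for v in component for p in preds[v]):
                break
        for v in sorted(component, reverse=True):
            # v joins only if no neighbor (predecessor or successor) was selected.
            if not any(w in selected for w in preds[v] | succ[v]):
                selected.add(v)
        remaining -= component
    return selected

# Hypothetical example: 3-cycle 1 -> 2 -> 3 -> 1 with a tail 3 -> 4 -> 5.
example = {1: {3}, 2: {1}, 3: {2}, 4: {3}, 5: {4}}
```

A process with identifier id_i would output true exactly when id_i belongs to the returned set. In the example, the source component {1, 2, 3} selects 3, then 4 is blocked by 3, and 5 is selected.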
Lemma 1 Let i be any process. At the end of the k-th round (k ≥ 1) and later, the topology
stored in variable Topology_i is correct up to distance k − 1:

1. for every process j with d(j, i) ≤ k − 1, Topology_i stores the correct tuple (j, P.j, d(j, i))
of j, and

2. every tuple (id, ID, d) ∈ Topology_i is the correct one (j, P.j, d(j, i)) of some process j
if d ≤ k − 1.

Proof. We prove the lemma by induction on k. Let us observe first that the lemma
holds for k = 1 (inductive basis): once i executes its action, Topology_i always contains
(i, P.i, 0) and any other tuple (id, ID, d) satisfies d ≥ 1.

Assuming that the lemma holds for k (inductive hypothesis), we now prove the lemma
for k + 1 (inductive step). Any process u with d(u, i) ≤ k satisfies d(u, j) = d(u, i) − 1 ≤ k − 1
for some predecessor j of i. From the inductive hypothesis, Topology_j contains the correct
tuple (u, P.u, d(u, j)) of u at the end of the k-th round and later. Thus, i reads the correct
tuple (u, P.u, d(u, j)) in Topology_j and updates its distance correctly at every action in the
(k + 1)-th round and later. The hypothesis also implies that any tuple (u, ID, d) contained
in Topology_v of any predecessor v of i after the end of the k-th round satisfies d ≥ d(u, i) − 1
and is correct if d = d(u, i) − 1. Thus, the correct tuple (u, P.u, d(u, i)) is never removed
from Topology_i in the (k + 1)-th round or later. The first claim of the lemma holds for k + 1.

Existence of tuple (id, ID, d) (d ≠ 0) in Topology_i at the end of the (k + 1)-th round
or later implies that i reads (id, ID, d − 1) in Topology_j of some predecessor j of i. From
the hypothesis, any tuple (id, ID, d − 1) contained in Topology_j after the end of the k-th
round is correct (that is, id is the identifier of a really existing process, say v, ID is the identifier
set of P.v and d − 1 = d(v, j) holds) if d − 1 ≤ k − 1. Thus, any tuple (id, ID, d) contained in
Topology_i at the end of the (k + 1)-th round or later is correct if d ≤ k. The second claim
of the lemma holds for k + 1. □

The following corollary is derived from Lemma 1. 

Corollary 1 Let i be any process and D(i) be the maximum distance to i from all the
ancestors of i. At the end of the (D(i) + 1)-th round and later, Topology_i stores the exact
topology of the subgraph consisting of all the ancestors of i and i itself.

Proof. Concerning Topology_i at the end of the (D(i) + 1)-th round and later, Lemma
1 shows that the correct tuple (u, P.u, d(u, i)) of every ancestor u of i is contained, and any
tuple (id, ID, d) with d ≤ D(i) is correct. This implies that Topology_i at the end of the
(D(i) + 1)-th round and later can contain no tuple (id, ID, d) with d > D(i), since the
process with identifier id is not reachable to i in Topology_i and such a tuple is removed from
Topology_i if it exists. Thus the corollary holds. □
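
The convergence stated by Lemma 1 and Corollary 1 can be observed on a small instance. The sketch below is an illustration under several assumptions: rounds are simulated synchronously (every process executes once per round, reading the previous configuration), stored distances are assumed to be non-negative even in corrupted states, and the names `update`, `exact` and `run_rounds` as well as the corrupted initial configuration are made up for this example.

```python
from collections import deque

def update(i, preds, topo):
    """One execution of update(Topology_i) using the predecessors' current variables."""
    candidates = {(i, frozenset(preds[i]), 0)}
    for j in preds[i]:
        candidates |= {(u, id_set, dist + 1) for (u, id_set, dist) in topo[j]}
    # Keep, for each identifier, a single tuple of minimum distance
    # (ties between different ID sets are broken arbitrarily).
    best = {}
    for u, id_set, dist in candidates:
        if u not in best or dist < best[u][1]:
            best[u] = (id_set, dist)
    # Keep only identifiers from which i is reachable in the stored topology.
    reachable, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for p in best[u][0]:
            if p in best and p not in reachable:
                reachable.add(p)
                stack.append(p)
    return {(u, best[u][0], best[u][1]) for u in reachable}

def exact(i, preds):
    """Correct tuples (u, P.u, d(u, i)) for i and all of its ancestors."""
    dist, queue = {i: 0}, deque([i])
    while queue:
        u = queue.popleft()
        for p in preds[u]:
            if p not in dist:
                dist[p] = dist[u] + 1
                queue.append(p)
    return {(u, frozenset(preds[u]), d) for u, d in dist.items()}

def run_rounds(preds, topo, rounds):
    for _ in range(rounds):
        topo = {i: update(i, preds, topo) for i in preds}
    return topo

# 3-cycle 1 -> 2 -> 3 -> 1 with a tail 3 -> 4, so D = 3.
preds = {1: {3}, 2: {1}, 3: {2}, 4: {3}}
# Arbitrary corrupted initial variables, including the fake identifier 7.
corrupted = {
    1: {(7, frozenset({1}), 0)},
    2: {(1, frozenset({2}), 0), (2, frozenset({7}), 5)},
    3: set(),
    4: {(3, frozenset({4}), 1)},
}
final = run_rounds(preds, corrupted, 4)  # D + 1 rounds
```

In this run the fake identifier 7 disappears as soon as no stored tuple lists it as a predecessor, and the incorrect tuple of process 2 is discarded once a tuple with a smaller distance arrives.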

Theorem 3 Algorithm 4.1 presents a self-stabilizing deterministic UMIS algorithm in
asynchronous networks with identifiers. Its convergence time is D + 1 rounds where D is the
diameter of the network, and the memory space used at each node is O(m log n) bits.
Proof. Let Topology be the exact topology of the network. It is obvious that
UMIS(Topology) correctly finds a UMIS when executed until WorkingTp = ∅ holds. When
Topology_i stores the exact topology of the subgraph consisting of all ancestors of i, UMIS(Topology_i)
selects i as a member of UMIS iff UMIS(Topology) selects i: whether process i is selected by
UMIS(Topology) depends only on the topology of the subgraph consisting of all ancestors of
i. Corollary 1 guarantees that Topology_i of every process i stores such exact topology at the
end of the (D + 1)-th round and later, and thus, the theorem holds. As the Topology variable
may end up containing an entry for every node, the overall space needed is O(m log n) bits
per process. □

Notice that Algorithm 4.1 enables each process i to know eventually the exact topology 
of the subgraph consisting of all the ancestors of i. Algorithm 4.1 can be easily extended so 
that each process can eventually get the exact topology containing the input values of the 
ancestors if each process has a static input value. Such an extension results in a universal 
scheme since it can solve any non-reactive problem that is consistently solvable at each 
process using the topology and the input values of its ancestors. 

Another observation is that Algorithm 4.1 can easily be modified to become silent. For
simplicity of our presentation, every process always has an enabled action with guard true,
and thus, Algorithm 4.1 is not silent. But Algorithm 4.1 becomes silent by changing the
guard so that the action becomes enabled only when Topology_i needs to be updated.



4.2 Probabilistic solution with unbounded memory in asynchronous 
anonymous networks 

In this subsection, we present a probabilistic self-stabilizing UMIS algorithm for
asynchronous anonymous networks. The solution is based on a probabilistic unique naming
of the processes and a deterministic UMIS algorithm that assumes unique process
identifiers. In the naming algorithm, each process is given a name variable that can be arbitrarily
large (thus the unbounded memory requirement). The naming is unique with probability 1
after a bounded number of new name draws. The new name draw consists in appending a
random bit at the end of the current identifier. Each time the process is activated, a new
random bit is appended. In parallel, we essentially run the deterministic UMIS algorithm
presented in the previous subsection. The main difference from the previous algorithm is in
handling the process identifiers. The variable Topology of a particular process may contain
several different identifiers of the same process since the identifier of the process continues to
get longer and longer in every execution of the protocol. To circumvent the problem, we
consider two distinct identifiers to be the same if one is a prefix of the other, and anytime
such same identifiers conflict, only the longest one is retained. Another difference is that
we do not need the distance information. The distance information is used in the
previous algorithm to remove the fake tuples (i, ID, d) of process i such that ID ≠ P.i, which
may exist in the initial configuration. In our scheme, tuples with fake identifiers that are
prefixes of identifiers of real processes are eventually removed in Algorithm 4.2 since the
correct identifier eventually becomes longer than any fake identifier. Other tuples with fake
identifiers are eventually disconnected from the constructed subgraph topology.

The details of the algorithm are given in Algorithm 4.2; only the topology update part 
is described since the UMIS function is the same as in Algorithm 4.1. 



Algorithm 4.2 Probabilistic UMIS algorithm in asynchronous anonymous networks

variables of process i
    id_i: identifier (binary string) of i;
    P_i: identifier set of P.i;
    Topology_i: set of (id, ID) tuples;  // topology that i is currently aware of.
        // id: a process identifier
        // ID: identifier set of P.(id)
function
    update(Topology_i)
        id_i := append(id_i, random_bit);  // append a random bit to the current id
        Topology_i := {(id_i, P_i)} ∪ ⋃_{j ∈ P_i} Topology_j;
        while ∃(id, ID), (id', ID') ∈ Topology_i s.t. id' is a prefix of id
            remove (id', ID') from Topology_i;
        while ∃(id, ID) ∈ Topology_i s.t. id is unreachable to i in Topology_i
            remove (id, ID) from Topology_i;
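
The identifier handling can be illustrated in isolation. The following sketch assumes identifiers are encoded as Python strings over the characters '0' and '1', and only treats proper prefixes (two tuples carrying exactly the same identifier are left untouched here); `extend_identifier` and `prune_prefixes` are hypothetical helper names.

```python
import random

def extend_identifier(identifier, rng):
    """Draw a new name by appending one random bit to the current identifier."""
    return identifier + rng.choice("01")

def prune_prefixes(topology):
    """Drop every tuple whose identifier is a proper prefix of another stored identifier.

    topology: set of (id, ID) tuples, with id a bit string and ID a frozenset of bit strings.
    """
    ids = {ident for ident, _ in topology}
    return {
        (ident, id_set)
        for ident, id_set in topology
        if not any(other != ident and other.startswith(ident) for other in ids)
    }

# Hypothetical example: "01" is an outdated (shorter) version of "0110".
stale = {
    ("01", frozenset({"1"})),
    ("0110", frozenset({"10"})),
    ("10", frozenset({"0110"})),
}
pruned = prune_prefixes(stale)
```

Since a process only ever appends bits, an older copy of its identifier is always a prefix of the current one, which is why retaining the longest identifier keeps the most recent information.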



Theorem 4 Algorithm 4.2 presents a self-stabilizing probabilistic UMIS algorithm in
asynchronous anonymous networks. Its expected convergence time is O(log n + log ℓ + D) rounds
where D is the diameter of the network and ℓ is the number of fake identifiers in the initial
configuration.

Proof Sketch: It is clear that the identifier of any process eventually becomes distinct from 
any other's with probability 1. We first show that every process has a unique identifier in 
O(logn) expected rounds. 

We consider, as the worst-case scenario, the case where all processes start with the same 
identifier and each process appends only a single bit to its identifier in every round. 

The probability that every process has a unique identifier at the end of round k (i.e.,
n random strings of k bits are mutually distinct) is evaluated as follows when n is small
compared to 2^k:

\[
\prod_{i=1}^{n-1}\left(1-\frac{i}{2^{k}}\right) \approx \prod_{i=1}^{n-1}\exp\left(-\frac{i}{2^{k}}\right) = \exp\left(-\frac{n(n-1)}{2^{k+1}}\right) \approx \exp\left(-\frac{n^{2}}{2^{k+1}}\right)
\]

We introduce a discrete random variable X to represent the number of rounds required until
every process has a unique identifier. When we consider the execution after round 2 log n to
guarantee n is small compared to 2^k, the expected number of rounds is then bounded by

\[
2\log n + \sum_{i=2\log n}^{\infty} \mathrm{Prob}(X > i) = 2\log n + \sum_{i=2\log n}^{\infty}\left(1-\exp\left(-\frac{n^{2}}{2^{i+1}}\right)\right) \approx 2\log n + \sum_{i=2\log n}^{\infty}\frac{n^{2}}{2^{i+1}} \le 2\log n + O(1)
\]

Thus, every process has a unique identifier in expected O(logn) rounds. 
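
The approximations used in this estimate can be checked numerically. The sketch below is only a sanity check under the assumption that logarithms are taken in base 2 (which matches the comparison of n with 2^k); the function names and the sample values n = 16, k = 12 are chosen for this example.

```python
import math

def prob_all_distinct(n, k):
    """Exact probability that n independent uniform k-bit strings are pairwise distinct."""
    p = 1.0
    for i in range(1, n):
        p *= 1.0 - i / 2**k
    return p

def prob_approx(n, k):
    """Closed-form approximation exp(-n^2 / 2^(k+1)) used in the proof sketch."""
    return math.exp(-n * n / 2 ** (k + 1))

def tail_sum(n, terms=60):
    """Partial sum of n^2 / 2^(i+1) for i starting at 2*log2(n); n is a power of two here."""
    start = 2 * int(math.log2(n))
    return sum(n * n / 2 ** (i + 1) for i in range(start, terms + 1))

n, k = 16, 12
gap = abs(prob_all_distinct(n, k) - prob_approx(n, k))
```

For these sample values the exact product and the approximation differ by a few thousandths, and the tail sum is essentially 1, consistent with the additive O(1) term.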

Processes may still have the same identifiers as those contained in fake tuples. By a similar
argument to the above, we can see that an additional O(log ℓ) expected rounds are sufficient to
give each process an identifier distinct from any fake one. Then, all the fake identifiers are
removed from Topology_i of each process i since such identifiers either become unreachable
to i in Topology_i or become prefixes of real identifiers.

After all identifiers become distinct from one another, the topology stored in Topology_i
of each process i becomes stable if the process identifiers are ignored (i.e., only process
identifiers get longer and longer). On the other hand, once the identifier of a process u
becomes lexicographically larger than that of a process v, u's identifier is lexicographically
larger than v's afterward. This guarantees that every execution of UMIS(Topology_i) at
process i after some point returns the same result concerning whether process i is a member
of the UMIS or not. By a discussion similar to the proof of Theorem 3, we can show that
an additional O(D) rounds are sufficient to get the stable UMIS solution once every process
has a unique identifier.

Consequently, Algorithm 4.2 presents a self-stabilizing probabilistic UMIS algorithm and
its expected convergence time is O(log n + log ℓ + D) rounds. □



4.3 Probabilistic solution with bounded memory in synchronous 
anonymous networks 

The algorithm in the previous section is based on global unique naming; however,
self-stabilizing global unique naming in unidirectional networks inherently requires unbounded
memory. The goal of this subsection is to achieve, with bounded memory, a local unique
naming that gives each process an identifier that is different from that of any of its ancestors,
and to compute a UMIS based on the previously computed local naming. Indeed, such a
local naming is sufficient for each process to recognize the strongly connected component
it belongs to. Once the component is recognized, a UMIS can be computed by a method
similar to that of the deterministic algorithm presented in Section 4.1.

In our scheme to achieve local unique naming, each process extends its identifier by 
appending a random bit when it finds an ancestor with the same identifier as its own. To 
be able to perform such a detection, a process needs to distinguish any of its ancestors 
from itself even when they have the same identifier. The detection mechanism is basically 
executed as follows: each process draws a random number, and disseminates its identifier 
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together with the random number to its descendants. When process i receives the same 
identifier as its own, it checks whether the attached random number is same as its own. If 
they are different, the process detects that this is a distinct process (that is, a real ancestor) 
with the same identifier as its own current identifier. When the process receives the same 
identifier with the same random number as its own for a given period of time, it draws a 
new random number and repeats the above procedure. Hence, as two different processes 
eventually draw different random numbers, eventually every process is able to detect an 
ancestor with the same identifier if such an ancestor exists. 
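
The claim that two distinct processes eventually draw different random numbers can be quantified with a short calculation. It rests on an assumption that the text does not state explicitly: each draw from {1, 2, ..., k} is uniform, and draws are independent across processes and across redraws. Under that assumption, for two processes i and j that share an identifier and compare freshly drawn values each time:

```latex
\Pr[rnd_i = rnd_j] = \frac{1}{k}, \qquad
\Pr[\text{equal in } t \text{ successive comparisons}] = k^{-t}, \qquad
\mathbb{E}[\text{comparisons until } rnd_i \neq rnd_j] = \frac{k}{k-1} \le 2 \quad (k \ge 2).
```

The last expression is the mean of a geometric distribution with success probability (k-1)/k, so under this assumption a constant expected number of redraws suffices for a single pair.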

The above method may cause false detection (or false positive) when a process receives its 
own identifier but with an old random number. To avoid such false detection, each identifier 
is relayed with a distance counter and is removed when the counter becomes sufficiently 
large. Moreover, the process repeats the detection checks while keeping sufficiently long 
periods of time between them. The details of the self-stabilizing probabilistic algorithm for 
the local naming are presented in Algorithm 4.3. 



Algorithm 4.3 Probabilistic local naming in synchronous anonymous networks

variables of process i
    id_i: identifier (binary string) of i;
    rnd_i: random number selected from {1, 2, ..., k};  // k (> 2) is a constant
    ID_i: set of (id, rnd, dist) tuples;  // identifiers that i is currently aware of
        // id: a process identifier
        // rnd: random number of the process with identifier id
        // dist: distance that id has traversed
function
    update(ID_i)
        ID_i := {(id_i, rnd_i, 0)} ∪ ⋃_{j ∈ P.i} {(id, rnd, dist + 1) | (id, rnd, dist) ∈ ID_j};
        while ∃(id, rnd, dist) ∈ ID_i s.t. dist > |{id | (id, *, *) ∈ ID_i}|
            remove (id, rnd, dist) from ID_i;
        if timer > |{id | (id, *, *) ∈ ID_i}|  // timer is incremented by one every round
            naming(ID_i)
    naming(ID_i)
        if ∃(id_i, rnd, *) ∈ ID_i s.t. rnd ≠ rnd_i
            id_i := append(id_i, random_bit);  // append a random bit to the current id
        rnd_i := number randomly selected from {1, 2, ..., k};
        reset_timer;  // reset timer to 0
        update(ID_i);
actions of process i
    true → update(ID_i);

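
The two functions of Algorithm 4.3 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function signatures, the string representation of identifiers, and the `rng` parameter are assumptions made for the example, and the timer is left to the caller. The sketch also assumes that a new random number is drawn on every call to `naming`, whether or not a conflict was detected, which matches the prose description of the detection mechanism.

```python
import random

def update(own_id, own_rnd, pred_sets):
    """Merge the predecessors' tuples (distance + 1) with the own tuple, then prune."""
    ids = {(own_id, own_rnd, 0)}
    for pred in pred_sets:  # one set ID_j per predecessor j
        ids |= {(pid, rnd, dist + 1) for (pid, rnd, dist) in pred}
    # Repeatedly drop tuples whose distance exceeds the number of distinct
    # identifiers still present; the bound can shrink as tuples are removed.
    while True:
        bound = len({pid for (pid, _, _) in ids})
        stale = {t for t in ids if t[2] > bound}
        if not stale:
            return ids
        ids -= stale

def naming(own_id, own_rnd, ids, k, rng):
    """Append a random bit if own_id is seen with a foreign random number."""
    if any(pid == own_id and rnd != own_rnd for (pid, rnd, _) in ids):
        own_id += rng.choice("01")
    # Assumption: a fresh random number is drawn on every call.
    return own_id, rng.randint(1, k)
```

A caller would invoke `update` once per synchronous round, increment its timer, and call `naming` (then reset the timer) once the timer exceeds the number of distinct identifiers in the returned set.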
Lemma 2 Algorithm 4.3 is a self-stabilizing probabilistic local naming algorithm in
synchronous anonymous networks. Its expected convergence time is $O((n + \ell) \log n)$ rounds,
where $\ell$ is the number of fake identifiers in the initial configuration.

Proof sketch: First we show that the algorithm is a self-stabilizing probabilistic local
naming algorithm. For contradiction, assume that two processes i and j (where j is
an ancestor of i) keep the same identifier forever after some configuration. Without loss of
generality, the distance from j to i is minimum among the process pairs that keep the same
identifier. Let $j, u_1, u_2, \ldots, u_m, i$ be the shortest path from j to i. Since all processes on
the path have mutually distinct identifiers except for the pair i and j, $(id_j, rnd_j)$ is not
discarded at the intermediate processes and is delivered to i. Thus, eventually i detects
$id_i = id_j$ and $rnd_i \neq rnd_j$. Then i extends its identifier by appending a random bit, which
is a contradiction.

We now evaluate the expected convergence time of the algorithm. By an argument similar to
the proof of Theorem 4, we can show that the expected number of bits added to a process
identifier is $O(\log n)$. Notice that the number $\ell$ of fake identifiers has no influence on
this evaluation, for the distance dist of a fake identifier is larger than the timer value (once
the timer is reset) and thus the fake identifier is removed (because of
$dist > |\{id \mid (id, *, *) \in ID_i\}|$) when function naming is executed. On the other hand, in
the scenario where all processes start with the same identifier, the time between two
executions of function naming at a process is $O(n + \ell)$. Thus, the expected convergence
time is $O((n + \ell) \log n)$ rounds. □

Algorithm 4.4 presents a self-stabilizing UMIS algorithm in locally-named networks. 
Thus, the fair composition [11] of the algorithm with the local naming algorithm in Algorithm 
4.3 provides a self-stabilizing UMIS algorithm in synchronous anonymous networks. For 
simplicity, we omit the code for removing fake initial information in Algorithm 4.4 since 
such fake initial information can be removed in a similar way to Algorithm 4.3. 

Lemma 3 In the algorithm presented in Algorithm 4.4, each process can exactly recognize 
the topology of the strongly connected component it belongs to in O(D) rounds where D is 
the diameter of the network. 

Proof sketch: After D rounds, variable $Topology_i$ of each process i clearly consists
of the tuples (id, P.(id)) of all the ancestors of i. Notice that the local naming allows two
distinct processes to have the same identifier if they are mutually unreachable. Thus,
$Topology_i$ may contain the same tuple (id, P) for two or more distinct processes, and/or may
contain two tuples (id, P) and (id, P') with the same id but different predecessor sets P and P'.

Each process constructs the following graph $G_i = (V_i, E_i)$: $V_i = \{id \mid (id, *) \in Topology_i\}$
and $E_i = \{(u, v) \mid (v, P) \in Topology_i \text{ s.t. } u \in P\}$. In other words, $G_i$ can be obtained
from the actual graph G as follows: first consider the subgraph $G'_i$ induced by the ancestors
of i and i itself, and then merge the processes with the same identifier into a single process.

Clearly, every process in $G_i$ can reach i. What we have to show is that a
process j is reachable from i in $G_i$ (that is, j belongs to the strongly connected component
of i) if and only if j is also reachable from i in $G'_i$. The if part is obvious since $G_i$ is
obtained from $G'_i$ by merging processes. The only if part holds as follows. Consider two
distinct processes j and j' with the same identifier, if any exist. Since they are mutually
unreachable but both can reach i, they are unreachable from i in $G'_i$ (otherwise one of them
would be reachable from the other). Hence, in the construction of $G_i$ from $G'_i$, merging is
applied only to processes unreachable from i; that is, the merging has no influence on
reachability from i. Thus, any process unreachable from i in $G'_i$ remains unreachable from
i in $G_i$. □

Algorithm 4.4 UMIS algorithm in locally-named networks

constants of process i
    id_i: identifier of i;  // distinct from that of any ancestor
    P_i: identifier set of P.i;
variables of process i
    umis_i: boolean;  // true iff i is a UMIS node
    Topology_i: set of (id, ID) tuples;  // topology that i is currently aware of
        // id: a process identifier
        // ID: identifier set of P.(id)
    Comp_i: identifier set;  // of processes in the strongly connected component of i
function
    update(Topology_i)
        Topology_i := {(id_i, P_i)} ∪ ⋃_{j ∈ P.i} Topology_j;
    UMIS(Topology_i)
        Comp_i := {id | id is reachable from i in Topology_i};
            // set of processes in the strongly connected component of i
        UMIS_i := ∅;
        if ∃j ∈ P.i − Comp_i s.t. umis_j = true or ∃j ∈ Comp_i s.t. (j > i and umis_j = true) {
            umis_i := false; output false;
        }
        else {
            umis_i := true; output true;
        }
actions of process i
    true → update(Topology_i); UMIS(Topology_i);

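
The component recognition used in Lemma 3 and in the first step of function UMIS of Algorithm 4.4 can be sketched as a plain graph search. The sketch below is illustrative only: the function name `component` and the representation of a topology as a set of `(id, frozenset_of_predecessor_ids)` pairs are assumptions for this example. It builds the edges of the merged graph from the collected tuples and returns the identifiers reachable from the process's own identifier; since every process in that graph can reach i, this set is taken as the strongly connected component of i.

```python
def component(own_id, topology):
    """Return the identifiers reachable from own_id in the merged graph.

    topology: set of (id, frozenset of predecessor ids) tuples; an edge
    u -> v exists whenever u appears in the predecessor set recorded for v.
    """
    successors = {}
    for v, preds in topology:
        for u in preds:
            successors.setdefault(u, set()).add(v)
    seen, stack = {own_id}, [own_id]
    while stack:  # iterative depth-first search along successor edges
        u = stack.pop()
        for v in successors.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen
```

For example, with edges a → b, b → c and c → b, the topology collected by c is `{("c", frozenset({"b"})), ("b", frozenset({"a", "c"})), ("a", frozenset())}` and the function returns `{"b", "c"}`: the source process a is an ancestor of c but does not belong to its component.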

Lemma 4 The algorithm presented in Algorithm 4.4 is a self-stabilizing (deterministic) UMIS
algorithm in (asynchronous) locally-named networks. Its convergence time is $O(n)$ rounds.

Proof sketch: First, by Lemma 3, every process correctly recognizes in O(D) rounds all
the processes in its own strongly connected component. Then consider a source strongly
connected component. The process with the maximum identifier in the component becomes a
stable UMIS member. After that, the UMIS outputs of the processes in the component become
stable one by one in descending order of identifiers. It takes at most $O(n')$ rounds until all
the processes in the component become stable, where n' is the number of processes in the
component.

The same argument can be applied to a source strongly connected component in the 
graph obtained from G by removing the components with stabilized UMIS outputs. By 
repeating the argument, we can show that the UMIS outputs of all the processes become 
stable in $O(n)$ rounds. It is clear that the processes with the UMIS outputs of true form a 
UMIS. □ 

From Lemmas 2 and 4, the following theorem holds. 

Theorem 5 The fair composition of the algorithms presented in Algorithm 4.3 and Algorithm 4.4
provides a self-stabilizing probabilistic UMIS algorithm in synchronous anonymous networks.
Its expected convergence time is $O((n + \ell) \log n)$ rounds, where $\ell$ is the number of fake
identifiers in the initial configuration. The space complexity of the resulting protocol is
$O(n \log n)$.

5 Conclusion 

Although in bidirectional networks, self-stabilizing maximal independent set is as difficult
as vertex coloring [16], this work proves that in unidirectional networks, the computing
power and memory required to solve these problems vary greatly. Silent solutions to
coloring in unidirectional uniform networks require $\Theta(\log n)$ (resp. $\Theta(\log \delta)$, where
$\delta$ denotes the maximal degree of the communication graph) bits per process and have
stabilization time $\Theta(n^2)$ (resp. $\Theta(1)$) when deterministic (resp. probabilistic) solutions
are considered. By contrast, deterministic maximal independent set construction in uniform
networks is impossible, and silent maximal independent set construction is impossible,
regardless of the deterministic or probabilistic nature of the protocols.

While we presented positive results for the deterministic case with identifiers, and the 
non-silent probabilistic cases, there remains the immediate open question of whether
a probabilistic solution with bounded memory can be devised in the asynchronous setting.

Another interesting issue for further research relates to global tasks. The global unique
naming that we present in Section 4.2 solves a truly global problem in networks where
global communication is not feasible, by defining proper equivalence classes between various
identifiers. The case of other classical global tasks in distributed systems (e.g. leader
election) is worth investigating.